perm filename ABST2[X,ALS] blob
sn#075317 filedate 1973-12-04 generic text, type T, neo UTF8
00100 The Amanuensis Speech Recognition System
00200
00300 by
00400
00500 James L. Hieronymus
00600 Neil J. Miller
00700 Arthur L. Samuel
00750
00760 Stanford A.I. Laboratory, Stanford University
00780
00782 Abstract
00785
00800 The Amanuensis speech recognition system under development at the
00900 Stanford A.I. Laboratory is a signature-table oriented system that
01000 uses machine learning techniques and attempts to extract a maximum
01100 amount of linguistic information from the acoustic speech signal. It
01200 differs from the system previously reported in a number of important
01300 respects:
01400 1) A new acoustic segmenter is used to extract prosodic
01500 features from the acoustic input and to isolate regions for especial
01600 treatment.
01700 2) Parameters for all voiced regions are determined pitch
01800 synchronously using a new glottal pulse locator.
01900 3) Use is made of information both from the steady-state or
02000 near-steady-state regions and from the transition regions.
02100 4) Speaker normalization is done, partly by formula and
02200 partly by signature tables.
02300 5) Greater use is made of the redundancy of speech to improve
02400 the recognition.
02500 6) Improvements have been made in the design and use of the
02600 signature tables both to improve their accuracy and to achieve a
02700 better compromise between the need for excessive amounts of training
02800 material and the need for smoothing.
02900 7) A bootstrapping technique is under study which should
03000 greatly reduce the amount of hand segmentation necessary to provide
03100 the annotated training material.
03200 8) Several possible output streams of phonemes are produced
03300 with probability ratings both for the complete streams and for the
03400 individual phonemes, so that it should never be necessary to go
03500 back to the original acoustic input data to resolve ambiguities and
03600 to incorporate syntactic, semantic and contextual information in the
03700 decision process.
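
The final item above describes an output of several candidate phoneme streams, each rated as a whole and per phoneme. The following is a minimal sketch of what such an output might look like, not the Amanuensis implementation. It assumes that each segment comes with a small set of candidate phonemes and probabilities, and that segments are independent, so that a stream's rating is the product of its phoneme ratings. The function name `rank_streams`, the data layout, and the example values are all hypothetical.

```python
import heapq
from itertools import product
from math import prod


def rank_streams(segments, k=3):
    """Return the k highest-rated phoneme streams.

    segments: list of dicts, one per segment, mapping a candidate
              phoneme label to its probability rating (hypothetical layout).
    Each result is (stream_rating, phonemes, phoneme_ratings).
    Assumption: segments are independent, so the stream rating is the
    product of the individual phoneme ratings.
    """
    candidates = []
    # Enumerate every combination of one candidate phoneme per segment.
    for combo in product(*(seg.items() for seg in segments)):
        phonemes = tuple(label for label, _ in combo)
        ratings = tuple(p for _, p in combo)
        candidates.append((prod(ratings), phonemes, ratings))
    # Keep only the k streams with the highest overall rating.
    return heapq.nlargest(k, candidates, key=lambda c: c[0])


# Illustrative input: three segments with made-up candidates and ratings.
segments = [
    {"k": 0.7, "g": 0.3},
    {"ae": 0.6, "eh": 0.4},
    {"t": 0.9, "d": 0.1},
]

for rating, phonemes, ratings in rank_streams(segments, k=3):
    print(" ".join(phonemes), round(rating, 3), ratings)
```

Because every stream carries both its overall rating and the ratings of its individual phonemes, a later stage that applies syntactic, semantic or contextual constraints could rescore or reject streams using this list alone. Exhaustive enumeration is used here only for brevity; it grows exponentially with the number of segments, and a practical system would prune as it goes.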